CoZo+ - A Content Zoning Engine for textual documents

Authors

  • Cynthia Wagner
  • Christoph Schommer
Abstract

Content zoning can be understood as the segmentation of textual documents into zones. The idea is inspired by [6], who first proposed an approach for the argumentative zoning of textual documents. With the prototypical CoZo+ engine, we focus on content zoning for the automatic processing of textual streams, considering only the actors as zones. The information gained can be used to automatically recognize content for pre-defined actors. We regard CoZo+ as a necessary preliminary step towards the automatic generation of summaries and towards making the intellectual ownership of documents detectable.

Similar resources

Muma: a Music Search Engine Based on Content Analysis

Existing music search engines are often limited to the textual modality (i.e., searching the textual metadata that are attached to music documents). We introduce here MUMA (http://muma.labs.exalead.com), a new search engine that relies both on textual metadata and signal processing metadata. MUMA allows the user to search for particular chords sequences, for specific moods, and to listen to aut...

Template for Regular Entry

DEFINITION The widespread search engines, in the professional as well as the personal context, used to work on the basis of textual information associated or extracted from indexed documents. Nowadays, most of the exchanged or stored documents have multimedia content. To reduce the technological gap so that these engines still can work on multimedia content, it is very convenient developing met...

A Classification Model for Mining Research Publications from Crowdsourced Data

Automatic access of natural language meaning is a prominent way of implementing search engines for document classification. The technique is difficult and often presents search results in rough approximates. It has minimal linguistic processing performed to identify content words like nouns and verbs in indexed documents. However, word frequency in documents can be taken as clues to their simil...

Ranking Techniques for Cluster Based Search Results in a Textual Knowledge-base

This paper presents a framework and methodology to improve the search experience in digital library systems. The approach taken is to cluster a textual knowledgebase along multiple relations and return search results in the form of small, focused clusters. Specifically, we generate multiple relationship networks, one per relationship type, and then cluster these networks. At search time, we pre...

The PROBADO-Framework: Content-Based Queries for non-textual Documents

In this paper we describe the system architecture of PROBADO, a project funded by the German Research Foundation (DFG). Its main goal is to provide a general library infrastructure for dealing with non-textual documents, in particular for content-based searching. PROBADO provides an infrastructure that allows integrating existing data repositories and content-based search engines into one commo...

Journal:
  • CoRR

Volume abs/0811.0453

Publication date 2008